Understanding email
                           by John F. Zacharias

            (c) 1997 by John F. Zacharias, All Rights Reserved

One of the most popular uses of the Internet is its ability to transmit
mail anywhere in the world.  Transmission is virtually instantaneous, which
beats the more commonly used U.S. Postal Service, whose mail is often
referred to as "smail" or "snail mail" because of the long time it takes to
receive letters, especially from foreign sources.  Mail on the Internet is
referred to as "email" or "electronic mail" because it is transferred by
electronic means, which are very fast.

To understand email you need to understand a few concepts and the
terminology of the Internet.  First of all you need to understand the
"client/server" concept.  A "client" is a program on your computer which
talks to a "server" (one that provides a service) program which is
generally on another computer.  It is the "server" which actually talks to
the Internet and provides various Internet services such as the ability to
transfer email, files, world wide web pages, etc. throughout the world of
the Internet.

The "client" programs on your computer generally handle a single function
such as processing mail, transferring files, or reading world wide web
pages.  In this article I will occasionally make reference to my email
client program, AEMail, and relate some of the concepts to it.

To talk to the "server" program, you need something called a TCP/IP stack.
TCP/IP stands for "Transmission Control Protocol/Internet Protocol".  The
word "protocol", which you will see a lot with the Internet, is defined as
"a clearly defined set of rules by which two sides communicate" and is a
widely used term in the telecommunications arena.  For the Internet these
"protocols" are generally published as RFCs, or "Requests for Comments".
Each RFC is numbered and describes the currently accepted "protocol" for a
particular area.

The term "stack" refers to a special model that is used for describing
telecommunications.  This is a fairly lengthy concept that we will not go
into here.  The TCP/IP stack is provided by a special kind of
communications program that resides on your computer.  Three widely used
TCP/IP stacks on the Amiga are AmiTCP, Termite TCP, and Miami.  Of these
three programs AmiTCP is probably the hardest to use and Miami is probably
the easiest.

The "server" program is generally provided by an ISP, or "Internet Service
Provider".  Since the "client" and "server" programs talk with commonly
known "protocols", each program can reside on any computer, whether it be a
PC, Macintosh, Amiga, or a Unix-based computer.  As long as a particular
ISP supports the particular protocol that is being used, your Amiga can
talk to it.  If you hear "we don't support the Amiga" from an ISP it only
means that they do not have anyone on their staff who knows anything about
the Amiga, so they cannot provide any assistance if you are having trouble
connecting to them.  If you can talk to them in Internet specific terms,
then they should be able to answer your questions.  In other words, don't
mention that you have an Amiga!

There is one other thing you need to know about the connection to your ISP.
There are three ways a home computer can talk to the ISP.  These are
generally referred to as shell, SLIP, or PPP accounts.  With a "shell"
account your computer does not have to know TCP/IP.  It can communicate
with the ISP with a simple terminal program.  However, in this case your
computer is nothing more than a "dumb terminal".  The "client" programs
actually reside on the ISP.  In this case you are limited in what you can
do and you need to know Unix, since most ISP shell accounts use a Unix
operating system interface.

Both PPP and SLIP can talk TCP/IP to your ISP, which means that client
programs, such as email programs, web browsers, or file transfer programs,
can reside on your Amiga.  PPP stands for "Point-to-Point Protocol" and
SLIP stands for "Serial Line Internet Protocol".  SLIP is an older protocol
and is not as reliable as PPP.  Most systems today use PPP.  Both of these
protocols allow you to dial up your ISP and then talk using TCP/IP.

The Internet is a global communication facility that talks to many
locations at once.  It is similar to the Postal Service in that you can
deposit your mail at any one location and it will be reliably delivered to
its destination provided you have addressed it correctly.

As with mailing a letter, you need to have an address so that the Internet
knows where to deliver your mail.  An address on the Internet is a series
of numbers referred to as an IP address.  These numbers can identify a
location just as the city, state, and zip code on a letter can identify the
post office that a letter is to be delivered to.

Since IP addresses are nothing more than numbers they are very hard to
remember.  So the Internet has come up with something called a "domain"
name.  The domain is a name that your ISP has assigned to itself.  There
are certain rules that have to be followed for this name.  First of all,
the name cannot be used by someone else.  In the US, you will usually see
the domain name followed by a suffix like .com, .edu, or .org (which stand
for commercial organization, educational organization, or non-profit
organization).  In foreign countries the domain name generally has a suffix
which identifies the country (e.g., .uk, .fr, .dk, or .de).

The Internet itself can only identify locations with the numeric IP
address.  However, each ISP has a "domain name service" (DNS) which can
look up a domain name and get its corresponding IP address.  This means
you can use domain names in addresses and not worry about what the real IP
address is.  When you set up your TCP/IP stack software you may have to
give the IP addresses of your ISP's Domain Name Servers.  However, smarter
TCP/IP software (such as Miami) can get these DNS IP addresses
automatically.
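
To make the idea concrete, here is a small sketch in Python (my choice of
language for the examples in this article, not something AEMail uses).  It
asks the system's resolver, and through it your DNS servers, to turn a name
into an IP address, and it shows that a dotted address is really just one
number.  The function names are made up for illustration, and 192.0.2.1 is
an address reserved for documentation.

```python
import ipaddress
import socket


def lookup(name: str) -> str:
    """Ask the system resolver (and so your DNS servers) for an IPv4 address."""
    return socket.gethostbyname(name)


def as_number(dotted: str) -> int:
    """Show that a dotted IPv4 address is just one 32-bit number."""
    return int(ipaddress.IPv4Address(dotted))


# A dotted address is returned unchanged, so no network lookup happens here.
loopback = lookup("127.0.0.1")

# 192.0.2.1 is a documentation-only address, used here as a stand-in.
number = as_number("192.0.2.1")
```

Calling lookup() with a real domain name only works while you are
connected and your DNS servers are set up correctly.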

In order to receive email you will have an "email address".  This email
address looks like this:  username@domain-name.  The domain-name is
generally the domain name of your ISP.  Sometimes this domain name is
prepended with a computer name followed by a period; however, just the
domain name of the ISP usually is sufficient.  The username is a name
either chosen by you or assigned to you by your ISP.  This is how the ISP
identifies you.  You should check with your ISP to be sure what your email
address is.

When you send mail to someone you must also know their email address.  This
is how you address your "letters" to them, just as the letters you send by
snail mail are addressed to a particular recipient.  You can think of the
part of the email address to the left of the "@" sign as the same as your
street address or P.O. Box and the part of the address to the right of the
"@" as the city, state, and zip code which identify the post office (ISP)
that handles your mail.  You can also specify a "real name" which can be
thought of as the name part of your address.  The real name is placed
either after the email address, surrounded by parentheses, or in front of
the email address, with the email address surrounded by less than/greater
than brackets (<......>).  You do not need a real name to
receive email, however.  The email address is sufficient.  AEMail uses the
real-name <email address> format in identifying who is sending the mail.
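
As an illustration, the sketch below uses Python's standard email.utils
module to pull the real name, the username, and the domain name out of both
address formats.  The addresses are made-up examples and split_address is a
hypothetical helper name; this is not how AEMail itself is written.

```python
from email.utils import formataddr, parseaddr


def split_address(text: str):
    """Return (real_name, username, domain) for one address string."""
    real_name, address = parseaddr(text)
    # Everything left of the last "@" is the username, the rest the domain.
    username, _, domain = address.rpartition("@")
    return real_name, username, domain


angle = split_address("Jane Doe <jdoe@example.com>")   # real name in front
paren = split_address("jdoe@example.com (Jane Doe)")   # real name after
bare = split_address("jdoe@example.com")               # no real name at all

# Going the other way: build the "real-name <email address>" form.
rebuilt = formataddr(("Jane Doe", "jdoe@example.com"))
```

All three forms yield the same username and domain; only the real name
differs, and it is empty for the bare address.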

With the Internet there are two protocols that handle sending and receiving
mail.  These are SMTP (Simple Mail Transfer Protocol) and POP (Post Office
Protocol).  Mail is sent using the SMTP protocol.  This is also
the protocol that the Internet uses for transferring mail between two
different locations.  The POP protocol is used to transfer the mail from
your ISP to your client software.  With the POP protocol your ISP is able
to store your mail in mailboxes on the ISP's computer until you are ready
to retrieve it.  That is why you use the SMTP protocol to send mail and the
POP protocol to receive mail.

Your ISP may have two different servers to handle mail, the SMTP server to
handle the SMTP protocol and the POP server to handle the POP protocol, or
the same server may be used to handle both functions.  You usually have to
tell your email client software what the names of these servers are so it
can connect to the proper one to handle the function you want to perform
(either sending or receiving messages).  As an example, CalWeb (my ISP)
uses pop.calweb.com for its POP server and smtp.calweb.com for its SMTP
server.  Some ISPs use mail.domain-name to refer to both the POP and SMTP
servers.  Others have more exotic names.  You need to check with your ISP
for the proper names to be used for these services.

When you set up your email software you will need to know your email
address and the names of your POP and SMTP servers.
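
The sketch below shows, in Python, how those three pieces of information
fit together.  It is only a rough outline under several assumptions: the
MailSettings class and both function names are invented for this example,
the POP login name is assumed to be the part of your address to the left of
the "@", and the servers are assumed to accept plain connections.  Your ISP
may require something different.

```python
import poplib
import smtplib
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class MailSettings:
    address: str       # your email address, username@domain-name
    pop_server: str    # where mail waits until you retrieve it
    smtp_server: str   # where outgoing mail is handed over


def send(settings: MailSettings, message: EmailMessage) -> None:
    """Hand one message to the SMTP server (needs a live connection)."""
    with smtplib.SMTP(settings.smtp_server) as smtp:
        smtp.send_message(message)


def count_waiting(settings: MailSettings, password: str) -> int:
    """Ask the POP server how many messages are stored in the mailbox."""
    username = settings.address.split("@")[0]   # assumption, see above
    pop = poplib.POP3(settings.pop_server)
    try:
        pop.user(username)
        pop.pass_(password)
        count, _size = pop.stat()
        return count
    finally:
        pop.quit()


# Server names from the CalWeb example; the username is made up.
settings = MailSettings(
    address="someone@calweb.com",
    pop_server="pop.calweb.com",
    smtp_server="smtp.calweb.com",
)
```

Nothing here contacts a server until send() or count_waiting() is
actually called.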

Just like letters that you compose and send by snail mail, email letters
are divided into several sections.  The first is the header for the mail.
It tells the Internet things such as who sent the mail, who it's going to,
the date and time it is being sent, and the subject of the letter.  These
header fields have specific prescribed formats that consist of an
identifier followed by a colon which is then followed by a field that
contains the contents of that particular header.  Some of these headers and
what they contain are:

    Date: This header contains the date and time the message was sent.
        Since the time can be relative to anywhere in the world it is
        followed by an offset which can be subtracted from or added to the
        time to get Greenwich Mean Time (GMT) (sometimes referred to as
        "Universal Time" (UT)).

    From: This identifies the email address of who is sending the mail.

    To: This identifies the email address of the recipient(s) of the mail.
        With email you can send the same mail to multiple recipients.  The
        email address of each recipient is separated by commas.

    cc: This identifies (by email address) who is to receive carbon copies
        of the mail.  Again, you can send carbon copies to multiple
        recipients.

    bcc: (Blind Carbon Copies).  This identifies who will receive what are
        called blind carbon copies.  When this header is used, the bcc:
        field will not show up in the messages received by the other
        recipients of the message.

    Reply-To:  This is the email address of where replies are to be sent.
        It is possible that a person has two or more email accounts and
        they want the replies directed to only one of these accounts.

    Subject:  This is the subject of the email message.  If the message is
        a reply to a previous message, RE:  will appear in front of the
        subject.  This is usually done automatically by the email client
        software.  If you are forwarding a message you have received to
        another party, (fwd) will generally appear after the subject line.

    Organization:  This is informational only and identifies the
        organization that one belongs to.  It is not a necessary header.
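
To show what these headers look like in practice, here is a short Python
sketch that fills them in with the standard email package.  Every name,
address, and date in it is invented for the example.  Note the "-0700" at
the end of the date: that is the offset from GMT described under Date:
above.

```python
from datetime import datetime, timedelta, timezone
from email.message import EmailMessage
from email.utils import format_datetime

# 2:30 in the afternoon, seven hours behind GMT (made-up values).
sent = datetime(1997, 8, 15, 14, 30, tzinfo=timezone(timedelta(hours=-7)))
date_text = format_datetime(sent)

msg = EmailMessage()
msg["Date"] = date_text
msg["From"] = "Jane Doe <jdoe@example.com>"
msg["To"] = "first@example.org, second@example.org"   # comma separated
msg["Cc"] = "third@example.org"
msg["Reply-To"] = "jane.home@example.net"
msg["Subject"] = "Re: Meeting notes"
msg["Organization"] = "Example User Group"

# The library splits the To: header back into individual addresses.
recipients = [person.addr_spec for person in msg["To"].addresses]
```

Each header is simply an identifier, a colon, and the contents, which is
why a program can take them apart again so easily.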

The above headers are the ones that are generally used by people composing
and sending email.  Other headers are usually added as the message travels
through the Internet.  Your client software may also include headers that
identify the client software (the X-Mailer:  header) and the types of
attachments that are being placed on the message.  Your email client
software will probably hide these headers from you by default.  AEMail has
the ability to either hide the headers or show them as the user wishes.  With
AEMail you can also specify which headers you want shown.

The next section of an email message is the body.  It is here that you
compose the message that you want to send.  The header section and the body
of the message are always separated by one blank line.  With AEMail, the
headers are automatically created from information you provide on the
"compose message screen" and you don't have to worry about creating them
yourself.  A facility is provided, however, for adding additional headers
of the user's own choosing.
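
The blank-line rule is easy to see in code.  In this Python sketch the raw
message text is a made-up example; it is split once by hand at the first
blank line and once with the standard email package, and both give the same
body.

```python
from email import message_from_string

raw = (
    "From: jdoe@example.com\n"
    "To: friend@example.org\n"
    "Subject: Hello\n"
    "\n"
    "This is the body of the message.\n"
)

# Split by hand at the first blank line ...
header_text, _, body_text = raw.partition("\n\n")

# ... or let the standard library do the same job.
parsed = message_from_string(raw)
subject = parsed["Subject"]
body = parsed.get_payload()
```

Everything before the blank line is header, everything after it is body,
no matter how many more blank lines the body itself contains.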

The body of the message is followed by a "signature" block.  This is a
block of text, appended to the message, which the user sets up to identify
and give information about himself or herself.  This could contain email
addresses or world wide web addresses that the user uses or maintains.  It
might also contain information on the organizations that the user belongs
to.  The "signature" block is optional.  The user can set up a common
signature block that is always appended to every message he or she sends.
In AEMail you can have multiple signature blocks set up.  You can then
specify which one is to be used with any particular email message.  You can
even suppress using the signature block on certain messages.

The above are the basic building blocks of an email message.  However, as
with snail mail letters, you might want to include attachments to your
message.  These attachments are usually files on your system that you might
want to include with your message.  These files can be text files, program
files, picture files, or sound files - in fact any type of file that your
system and the recipient's system can handle.

A standard has been developed for email attachments called MIME or
"Multipurpose Internet Mail Extensions".  With this type of attachment you
assign the file you wish to attach to a specific type and subtype.  The
types that are defined are:  text, message, multipart, application, image,
audio, and video.  Text is used for normal text type documents.  Subtypes
in this category are "plain", "enriched", and "richtext".  The one most
people would be using is "plain" for a normal ASCII document.

"Message" is a special category used when error messages are returned with
an attachment of the message that was in error.  "Multipart" is also a
special category that indicates that attachments are included and describes
the method in which they are separated.  Normally you do not have to worry
about this type since it is generated automatically when you specify
attachments.

"Application" is used when you want to attach programs, lha files, or other
binary type files.  Native word processing files should also use this type.
Subtypes within the "application" type are "octet-stream" (any binary type
file) and "postscript" (for documents in postscript format).

If you are attaching image files, sound files, or video files, don't use
"application", but rather use the appropriate "image", "audio", or "video"
types.  If you use these types your email program will be able to interpret
the files correctly and display them as appropriate.
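
One simple way a program might pick a type/subtype is from the file name,
falling back to "application/octet-stream" for anything it does not
recognize.  The Python sketch below does just that.  The table is a small
illustrative sample of my own, not AEMail's actual list, and both function
names are hypothetical.

```python
import os

# A small, illustrative table; real mail programs know many more types.
TYPES_BY_EXTENSION = {
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".au": "audio/basic",
    ".mpg": "video/mpeg",
    ".ps": "application/postscript",
}


def content_type(filename: str) -> str:
    """Pick a type/subtype, falling back to a generic binary type."""
    extension = os.path.splitext(filename)[1].lower()
    return TYPES_BY_EXTENSION.get(extension, "application/octet-stream")


def split_type(value: str):
    """Break 'type/subtype' into its two halves."""
    maintype, _, subtype = value.partition("/")
    return maintype, subtype
```

For example, content_type("archive.lha") gives "application/octet-stream"
because .lha is not in the table, which matches the advice above for
binary files.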

AEMail uses a special file called a "mailcap" file to tell it how to
display a particular type/subtype.  A "mailcap" file is a common Internet
type of file.  It specifies the various types/subtypes and the programs
that are used to display any particular type/subtype file.  For people
using AmigaDOS 3.x, the program for displaying the attachments is by
default "multiview".  That's because multiview uses datatypes that can
automatically detect what sort of file is being displayed.  You do need to
have the datatype for that file type in your datatype directory, however.
In AEMail, you can use any program you want for displaying your attachments
of a particular type/subtype - it does not have to be multiview.
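
A mailcap file is usually a plain text file with one line per type, giving
the type/subtype, a semicolon, and the command to run, where %s stands for
the file name.  The Python sketch below reads such lines and picks a
viewer.  The sample entries are hypothetical, I have not tried to
reproduce AEMail's own file, and the parser is simplified (it ignores
optional extra fields and escaped semicolons).

```python
MAILCAP_TEXT = """\
# type/subtype; command (%s stands for the file name)
image/*; multiview %s
text/plain; more %s
"""


def parse_mailcap(text: str) -> dict:
    """Map each type/subtype (or type/*) to its command template."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        fields = [field.strip() for field in line.split(";")]
        if len(fields) >= 2:
            entries[fields[0].lower()] = fields[1]
    return entries


def viewer_command(entries: dict, content_type: str, filename: str):
    """Return the command for this file, or None if no entry matches."""
    maintype = content_type.split("/")[0]
    # Try the exact type first, then the wildcard for the whole category.
    template = entries.get(content_type) or entries.get(maintype + "/*")
    if template is None:
        return None
    return template.replace("%s", filename)


entries = parse_mailcap(MAILCAP_TEXT)
```

With these entries, any image subtype is handed to multiview, while a type
with no entry at all yields None so the caller can decide what to do.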

Another problem with email is the coding scheme that is used to represent
characters and binary data.  To understand this, we need to know a bit of
history.  In the early days of telecommunications (transmitting data over
phone lines), the protocols that were used (before the Internet and TCP/IP)
used a code called ASCII.  We still use that code today.  However, the
original ASCII was only seven bits.  An eighth bit was reserved for
something called "parity".  Parity was used to check the accuracy of the
transmission.

Seven bits allow codes for only 128 characters.  This was sufficient for
English since 32 codes could be reserved for "control characters" (used to
control the transmission), 32 codes for numeric and special characters, 32
codes for uppercase alphabetic characters and some special characters, and
32 codes for lowercase alphabetic characters and some special characters.
The extra special character codes were enough to handle all of the special
characters that were available on most keyboards.
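
The arithmetic can be checked with a few lines of Python.  The sketch
splits the 128 codes into the four groups of 32 and adds a parity bit in
the eighth position.  I have assumed "even" parity here (the extra bit is
chosen so the number of 1 bits comes out even); other links used odd
parity.  The function names are my own.

```python
def even_parity_bit(code: int) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return bin(code).count("1") % 2


def with_parity(code: int) -> int:
    """Put the parity bit in the eighth (highest) position."""
    if not 0 <= code < 128:
        raise ValueError("seven-bit ASCII codes run from 0 to 127")
    return (even_parity_bit(code) << 7) | code


total_codes = 2 ** 7                     # seven bits give 128 codes

# Four groups of 32: control, digits/punctuation, uppercase, lowercase.
groups = [range(start, start + 32) for start in range(0, 128, 32)]

letter_a = ord("A")                      # 65, binary 1000001, two 1 bits
letter_c = ord("C")                      # 67, binary 1000011, three 1 bits
```

"A" already has an even number of 1 bits, so its parity bit is 0 and the
byte stays 65; "C" has three, so the parity bit is set and the byte sent
is 195.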

However, once foreign languages and their special needs were introduced,
128 character codes were not sufficient.  Other error checking schemes were
introduced and an extended ASCII character set was developed that uses
eight bits rather than seven.  The TCP/IP protocol can handle eight bits;
however, some of the terminals accessing the TCP/IP networks (and the
protocols they use) can only handle the seven bit ASCII codes.

Email is generally used for transmitting text messages although with the
use of attachments, we can also send program files and other special types
of files that use binary or eight bit data.  Also, both standard seven bit
ASCII and the extended eight bit ASCII still require special codes for
"control characters".  This is to handle such things as line breaks
(carriage return and line feed characters) and tab characters.  These
"control characters" interfere with pure binary data.

Therefore, special coding schemes are required to send email messages in
order to handle sending extended ASCII and binary data.  The coding schemes
used with email include:

	Standard seven bit ASCII (referred to as 7-bit)
	Extended eight bit ASCII (referred to as 8-bit)
	Quoted-Printable
	Base64

Both "Quoted-Printable" and "Base64" are used to send eight bit data.
"Quoted-Printable" is generally used to send extended eight bit ASCII on
systems that can only handle standard seven bit ASCII.  Most of the
characters are sent as ordinary seven bit ASCII characters, but for those
special "foreign characters", a special "escape" code (an equals sign) is
used and the character is sent as its two-digit hexadecimal code.
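
Here is a tiny Python sketch of Quoted-Printable at work, using the
standard quopri module.  The sample is the word "cafe" with an accented
final e, which in the common Latin-1 extended character set is the single
byte E9 (hexadecimal).

```python
import quopri

# "cafe" with an acute accent on the e; in Latin-1 the accented
# letter is the single eight-bit byte 0xE9.
eight_bit = b"caf\xe9"

encoded = quopri.encodestring(eight_bit)   # only seven-bit characters remain
decoded = quopri.decodestring(encoded)     # back to the original bytes
```

The three plain letters pass through untouched and only the accented
letter is replaced, by "=E9", which is why Quoted-Printable text stays
mostly readable.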

"Base64" is a coding scheme that allows pure binary data to be sent.  In
AEMail the Base64 coding is referred to as "encoded binary".  In this
scheme every three bytes of data are encoded as four ASCII characters and
placed in lines limited to 76 characters.
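
The three-into-four rule and the 76 character line limit can both be seen
with Python's standard base64 module.  The sample data is arbitrary: the
three letters "Man" and then one of every possible byte value.

```python
import base64

three_bytes = b"Man"
four_chars = base64.b64encode(three_bytes)   # three bytes in, four out

data = bytes(range(256))                     # every possible byte value
wrapped = base64.encodebytes(data)           # line break every 76 characters
longest_line = max(len(line) for line in wrapped.splitlines())
restored = base64.decodebytes(wrapped)       # the original bytes again
```

Because four characters carry only three bytes, a Base64 attachment is
about a third larger than the file it came from.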

All of these coding schemes can be used in both the body of a message and
in attachments, although some email clients can't handle Base64 for the
body of a message, so this should be avoided (AEMail, starting with version
1.20, can).

You will have to specify which coding scheme you want to use for both your
message body and any particular attachment.  Generally speaking, message
bodies and "text" and "message" attachments use either 7-bit, 8-bit, or
quoted-printable encoding.  Other attachment types use Base64 encoding.
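
That rule of thumb could be written down roughly as follows.  This Python
sketch is only one possible policy, not AEMail's actual logic: the
function name and the eight_bit_safe switch (meaning "I know everything
along the way can handle eight bit data") are my own inventions.

```python
def choose_encoding(maintype: str, data: bytes,
                    eight_bit_safe: bool = False) -> str:
    """Suggest a coding scheme for a message body or an attachment."""
    if maintype not in ("text", "message"):
        return "base64"                  # programs, images, sound, ...
    if all(byte < 128 for byte in data):
        return "7bit"                    # plain seven bit ASCII
    if eight_bit_safe:
        return "8bit"                    # send extended ASCII as it is
    return "quoted-printable"            # escape the extended characters
```

A plain English message comes out as "7bit", a text with accented letters
as "quoted-printable" (or "8bit" if you allow it), and a picture as
"base64".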

Another type of coding is used for attachments which derives from older
schemes used for adding attachments to messages on BBSes.  This scheme does
not use MIME and is called UUENCODING.  This coding scheme looks similar to
Base64.  Attachments using this coding scheme do not have a well defined
set of headers like MIME attachments do.  The attachment is embedded in the
message and has one header line beginning with the word "begin".  AEMail
can create and read UUENCODED attachments, but such attachments cannot be
displayed.  They can, however, be saved as separate binary files.
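
For the curious, a uuencoded attachment conventionally starts with a line
of the form "begin", a Unix-style file mode, and the file name, and ends
with a line containing "end".  The Python sketch below builds and reads
back such a block using the standard binascii module.  The two function
names and the file names are made up, and the decoder is deliberately
naive: it assumes the text contains nothing but one well-formed block.

```python
import binascii


def uuencode(data: bytes, filename: str, mode: str = "644") -> str:
    """Wrap data in a begin/end block, 45 bytes of data per line."""
    lines = [f"begin {mode} {filename}"]
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        lines.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    lines.append("`")                    # zero-length line ends the data
    lines.append("end")
    return "\n".join(lines) + "\n"


def uudecode(text: str) -> bytes:
    """Decode a block produced by uuencode() above."""
    body = text.splitlines()[1:-2]       # drop "begin ...", "`" and "end"
    return b"".join(binascii.a2b_uu(line) for line in body)


encoded = uuencode(b"Cat", "cat.txt")
```

Like Base64, each data line carries three bytes in every four characters,
which is why the two schemes look so similar on the page.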

If you are interested in using an email client program for the Internet,
you can obtain AEMail from my web page, http://www.calweb.com/~jzachar, or
from Aminet in the comm/mail section.  It can also be obtained from the
SACC library.  The latest version of AEMail is version 1.21, but a new
version, 1.30, is due out around the first of September.  The new version
will add clipboard support to the program.

AEMail is shareware.  The shareware fee is $30, but you can download a demo
copy from the sources listed above.  The demo copy will have certain
features disabled until AEMail is registered.